MapReduce Scheduler: A 360-degree view
Abstract
Undoubtedly, MapReduce is the most powerful programming paradigm in distributed computing. Enhancing MapReduce is essential, since it can make computation faster. Many scheduling algorithms have therefore been proposed, and they can be discussed on the basis of their characteristics. Moreover, many shortcomings in this field remain to be discovered. In this article, we present state-of-the-art scheduling algorithms to improve understanding of them. The algorithms are presented systematically, so that the article can open up future possibilities in scheduling algorithm design. We provide in-depth insight into MapReduce scheduling algorithms. In addition, we discuss various issues of MapReduce schedulers developed for large-scale computing as well as heterogeneous environments.
Similar resources
A Throughput Driven Task Scheduler for Batch Jobs in Shared MapReduce Environments
MapReduce is one of the most popular parallel data processing systems, and it has been widely used in many fields. As one of the most important techniques in MapReduce, task scheduling strategy is directly related to the system performance. However, in multi-user shared MapReduce environments, the existing task scheduling algorithms cannot provide high system throughput when processing batch jo...
Using Pattern Classification for Task Assignment in MapReduce
MapReduce has become a popular paradigm for large-scale data processing in the cloud. The sheer scale of MapReduce deployments makes task assignment in MapReduce an interesting problem. The scale of MapReduce applications presents a unique opportunity to use data-driven algorithms in resource management. We present a learning-based scheduler that uses pattern classification for utilization oriente...
Hadoop Map Reduce Job Scheduler Implementation and Analysis in Heterogeneous Environment
Hadoop MapReduce is one of the popular frameworks for Big Data analytics. A MapReduce cluster is shared among multiple users with heterogeneous workloads. When jobs are submitted to the cluster concurrently, resources are shared among them, so system performance might degrade. The issue here is how to schedule the tasks and provide a fair share of resources to all jobs. Hadoop supports different s...
FLEX: A Slot Allocation Scheduling Optimizer for MapReduce Workloads
Originally, MapReduce implementations such as Hadoop employed First In First Out (FIFO) scheduling, but such simple schemes cause job starvation. The Hadoop Fair Scheduler (HFS) is a slot-based MapReduce scheme designed to ensure a degree of fairness among the jobs, by guaranteeing each job at least some minimum number of allocated slots. Our prime contribution in this paper is a different, fle...
Simulation and performance evaluation of the Hadoop Capacity Scheduler
Hadoop task schedulers like Fair Share and Capacity have been specially designed to share hardware resources among multiple organizations. The Capacity Scheduler provides a complex set of parameters to give fine control over resource allocation of a shared MapReduce cluster. Administrators and users often run into performance problems because they do not understand the performance influence of ...
Journal: CoRR
Volume: abs/1704.02632
Publication date: 2016